Abstract

The unification algorithm is at the core of the logic programming paradigm, the first unification
algorithm being developed by Robinson [5]. More efficient algorithms were developed later [4], and
I introduce here yet another efficient unification algorithm centered on a specific data structure,
called the Unification Table.

1 Introduction 

The unification algorithm is at the heart of the logic programming paradigm [3]. Starting with
the classic algorithm of Robinson [5], unification algorithms have become more and more
efficient [4]. Unification theory is still under development today and receives
continuous scrutiny from the scientific community [2].

This paper presents yet another efficient unification algorithm centered on a data structure
called the Unification Table, which borrows some ideas from the data structures used by
Warren's Abstract Machine [1].

The next section presents the proposed unification algorithm in detail, giving C-style
pseudo code. An example application of the algorithm, taken from [1], is also presented.



2 The Unification Algorithm 



The unification algorithm below will compute the MGU of two logical terms x and y if the unification
is possible; otherwise it will report failure. As in most Prolog implementations of unification,
the "occur check" is omitted; however, with some modifications, the algorithm can implement a
"unify with occur check" procedure.

The algorithm is centered on a data structure called the Unification Table (UT) which contains 
information related to each subterm (constant, variable, composite) that occurs in the terms to be 
unified. The properties of the unification table are crucial to the correctness and the efficiency of 
the unification and will be described in detail below. 

The algorithm consists of two main steps: 

Step 1. Parse terms x and y and build the Unification Table (UT) 

Step 2. Call the unification function unify (index(x), index(y)) where index(x) and index(y) are 
the indexes of x and y in the UT. 

In the following we will take a closer look at each step of the unification algorithm, providing
detailed explanations wherever necessary.



2.1 Step 1 

The idea of this step is to build a data structure, called the Unification Table (UT), in which every
variable appears only once, and all the subterms of x and y are included. The UT contains three
types of entries: variables (type VAR, arity 0), constants (type STR, arity 0) and composite terms
(type STR, arity greater than 0).

Each entry of the UT, whatever its type, has the same columns:

Term - is the actual term; this column is never built; its purpose is only to simplify the 
explanations and the understanding of the algorithm. We can think of it as the result of a "write" 
procedure called upon the index of the term. 

Index - is the index of the table entry for the term; it starts with 0 and uniquely identifies the
term.

Main functor - is the main functor of the term. 

Type - is VAR for variables and STR for constants and composite terms.

Arity - is the arity of the term; for variables and constants, it is 0.

Components - is the list of components of the term. For variables and constants, it is the empty
list. For composite terms, it is the sequence of the indexes of the component subterms; the order
is important: it can be either left to right or right to left, but it must be the same throughout.
The parsing procedure is left to right and ensures that when an entry is created for a composite 
term, all its subterms are already in the UT. The parsing starts with the second term (y). 
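
The columns described above can be sketched as a small record type. This is only an illustration under assumptions: the names Entry, kind and write are hypothetical and not taken from the paper, and the four hand-written entries describe the term h(Y,f(a)) with illustrative index values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    index: int                      # position of the entry in the table
    functor: str                    # main functor (or the variable name)
    kind: str                       # "VAR" or "STR"
    arity: int = 0                  # 0 for variables and constants
    components: List[int] = field(default_factory=list)  # indexes of subterms


# Hand-written entries for h(Y,f(a)); subterms are entered before their parent.
table = [
    Entry(0, "Y", "VAR"),
    Entry(1, "a", "STR"),
    Entry(2, "f", "STR", 1, [1]),
    Entry(3, "h", "STR", 2, [0, 2]),
]


def write(table, i):
    """Rebuild the (never stored) Term column from an index."""
    e = table[i]
    if e.arity == 0:
        return e.functor
    args = ",".join(write(table, c) for c in e.components)
    return e.functor + "(" + args + ")"


print(write(table, 3))
```

The write helper plays the role of the "write" procedure mentioned for the Term column: the term text is recomputed from the index instead of being stored.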

Consider the following example from Ait-Kaci [1], which asks for the MGU of:

x = p(Z,h(Z,W),f(W))

y = p(f(X),h(Y,f(a)),Y)

The unification table for this example is filled by parsing the terms x and y from right to left,
starting with y, in a bottom-up manner. Each variable has exactly one entry in the table, but
identical constants or composite terms may have several entries. The list of components for a
given term consists of the indexes of its components. The unification function will start with
the indexes of x and y, that is 6 and 11, so the call will be unify (6, 11).
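
The table construction can be sketched as follows. This is a minimal illustration, not the paper's parser: it assumes terms are already available as nested tuples (functor, arg1, ...), that strings starting with an uppercase letter are variables, and that arguments are visited right to left with y processed first. The names build_table and add are hypothetical.

```python
def build_table(*terms):
    """Build a Unification Table; return (table, roots) with roots[k] the index of terms[k]."""
    table, var_index = [], {}

    def new_row(functor, kind, components):
        table.append({"functor": functor, "kind": kind,
                      "arity": len(components), "components": components})
        return len(table) - 1

    def add(term):
        if isinstance(term, str):
            if term[:1].isupper():               # variable: exactly one entry
                if term not in var_index:
                    var_index[term] = new_row(term, "VAR", [])
                return var_index[term]
            return new_row(term, "STR", [])      # constant: a fresh entry each time
        functor, *args = term
        # Visit the arguments right to left (subterms before the parent),
        # then store their indexes in left-to-right order.
        components = [add(a) for a in reversed(args)][::-1]
        return new_row(functor, "STR", components)

    roots = [add(t) for t in terms]
    return table, roots


x = ("p", "Z", ("h", "Z", "W"), ("f", "W"))
y = ("p", ("f", "X"), ("h", "Y", ("f", "a")), "Y")
table, (iy, ix) = build_table(y, x)              # y is parsed first
for k, row in enumerate(table):
    print(k, row["functor"], row["kind"], row["arity"], row["components"])
```

Under these assumptions the table has 12 entries and the two top-level terms land at indexes 6 and 11; which term gets which index depends on the assumed traversal order, so the layout of the paper's own table may differ.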

2.2 Step 2 

The unify function called in Step 2 starts with the indexes of x and y and uses two stacks, Sx
and Sy. Initially the indexes of x and y are pushed on the stacks Sx and Sy respectively. Then a
main loop starts and continues until both stacks are empty. The algorithm ensures that
both stacks will be empty simultaneously and that the stacks will eventually become empty if
the unification table was built correctly and the terms follow the correct syntax for logic terms.

The function will return either SUCCESS or FAIL; in case of success each variable included in
the MGU will be marked using a global data structure. The pseudo code for the unify function
follows:

function unify (ix, iy)
{
    initialize stack Sx
    initialize stack Sy
    push ix on stack Sx
    push iy on stack Sy

    while (not_empty (Sx) and not_empty (Sy))
    { // start main loop
        pop i from Sx
        pop j from Sy

        // case 1: i is bound to a term and j is bound to a term
        if (type(i) == STR and type(j) == STR)
            if (main functors of i and j match (both name and arity))
            {
                if (arity > 0)
                {
                    push components of i on Sx in sequence
                    push components of j on Sy in sequence
                }
            }
            else
                return (FAIL) // report failure

        // case 2: i is bound to a term and j is bound to a variable
        if (type(i) == STR and type(j) == VAR)
            if (j is a free variable)
                bind j to i and set mgu[j] = 1
            else // j is bound
            {
                dereference j
                if (j is bound to a STR)
                {
                    push i on Sx
                    push j on Sy
                }
                else // j is bound to a free variable
                    bind j to i
            }

        // case 3: i is bound to a variable and j is bound to a term
        if (type(i) == VAR and type(j) == STR)
            // perfectly symmetric to case 2

        // case 4: i is bound to a variable and j is bound to a variable
        if (type(i) == VAR and type(j) == VAR)
        {
            if (i is free and j is free)
                bind i to j (or vice versa) and set mgu[i] = 1
            if (i is free and j is bound)
                bind i to j and set mgu[i] = 1
            if (i is bound and j is free)
                bind j to i and set mgu[j] = 1
            if (i is bound and j is bound)
            {
                push the index of the term to which i is bound on Sx
                push the index of the term to which j is bound on Sy
            }
        }
    } // end main loop
    return (SUCCESS)
} // end function unify

In order to help the understanding of the algorithm, C++ style comments in italics are provided. 
The MGU for the above example computed with this function is: 

W=f(a) 
X=f(a) 
Y=f(f(a)) 
Z=f(f(a)) 
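
The MGU above can be checked with a small runnable sketch of the two-stack loop. Several things here are assumptions rather than the paper's code: the table rows are written by hand following a right-to-left, y-first layout (so y is entry 6 and x is entry 11), bindings are kept in a dictionary instead of being marked in mgu[], and both indexes are dereferenced before the case analysis, which folds the free/bound subcases of cases 2 to 4 together and skips a pair that refers to the same entry.

```python
# Assumed table layout; each row is (functor, kind, components).
UT = [
    ("Y", "VAR", []),           # 0
    ("a", "STR", []),           # 1
    ("f", "STR", [1]),          # 2   f(a)
    ("h", "STR", [0, 2]),       # 3   h(Y,f(a))
    ("X", "VAR", []),           # 4
    ("f", "STR", [4]),          # 5   f(X)
    ("p", "STR", [5, 3, 0]),    # 6   y = p(f(X),h(Y,f(a)),Y)
    ("W", "VAR", []),           # 7
    ("f", "STR", [7]),          # 8   f(W)
    ("Z", "VAR", []),           # 9
    ("h", "STR", [9, 7]),       # 10  h(Z,W)
    ("p", "STR", [9, 10, 8]),   # 11  x = p(Z,h(Z,W),f(W))
]


def deref(i, binding):
    """Follow bindings from index i until a free variable or a STR entry."""
    while UT[i][1] == "VAR" and i in binding:
        i = binding[i]
    return i


def unify(ix, iy):
    """Return a dict {variable index: bound index}, or None on failure."""
    binding = {}
    sx, sy = [ix], [iy]
    while sx and sy:
        i, j = deref(sx.pop(), binding), deref(sy.pop(), binding)
        if i == j:
            continue                       # same entry, nothing to do
        (fi, ki, ci), (fj, kj, cj) = UT[i], UT[j]
        if ki == "STR" and kj == "STR":
            if fi != fj or len(ci) != len(cj):
                return None                # functor name or arity mismatch
            sx.extend(ci)                  # push components pairwise
            sy.extend(cj)
        elif ki == "VAR":
            binding[i] = j                 # i is free after dereferencing
        else:
            binding[j] = i                 # j is free after dereferencing
    return binding


def write(i, binding):
    """Render entry i with bindings applied (unguarded: cyclic bindings would not terminate)."""
    functor, _, comps = UT[deref(i, binding)]
    if not comps:
        return functor
    return functor + "(" + ",".join(write(c, binding) for c in comps) + ")"


binding = unify(6, 11)
mgu = {UT[v][0]: write(v, binding) for v in binding}
for name in sorted(mgu):
    print(name + "=" + mgu[name])
```

Dereferencing up front is a design choice of this sketch: it avoids binding a variable to itself or closing a chain of variable-to-variable bindings into a loop, situations that the four-case pseudo code does not address explicitly.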

As one can see from the above function, the main loop extracts elements i and j from the two
stacks and then appropriate action is taken according to the four possible cases.

In the first case i and j are indices of two elements of type STR, so the natural action
here is to check if they have the same name and arity. If they do and the arity is greater than zero,
we push all the components of i and j on the two stacks; if they do not match, we report failure.

Cases two and three are symmetric and cover the situation in which either i or j is a variable, but
not both of them. Here we have to discriminate between the case when the variable is free
(we just have to bind it) and the case when it is bound (we have to dereference the variable and
push the result).

Finally in case four we have to deal with two variables and again we have to consider four 
subcases according to the status of the variables: free or bound. As before, if at least one variable 
is free we just have to bind it, otherwise we must push the terms to which the variables are bound 
on the stacks and continue the loop. 

If the "while" loop ends naturally, without the forced return in case of failure, then the function 
returns "success" as well as the MGU. 

In the case of an "occur check" violation the algorithm will succeed, but an attempt to print 
the result will result in an infinite loop, or a memory overflow; this can be solved as in many Prolog 
systems by using a "guarded write" which will go only (say) ten levels in depth. 
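
A "guarded write" of this kind might look like the sketch below. It is an illustration under assumptions, not the paper's implementation: terms are nested tuples (functor, arg1, ...), variables and constants are plain strings, bindings is a dictionary from variable names to terms, and the function name guarded_write and the "..." marker are hypothetical.

```python
def guarded_write(term, bindings, depth=10):
    """Render term with bindings applied, going at most `depth` levels deep."""
    # Follow variable bindings; each hop costs one level, so a cyclic
    # chain of bindings also runs out of depth and stops.
    while isinstance(term, str) and term in bindings:
        if depth == 0:
            return "..."
        term, depth = bindings[term], depth - 1
    if isinstance(term, str):
        return term                        # constant or free variable
    if depth == 0:
        return "..."                       # depth limit reached
    functor, *args = term
    inner = ",".join(guarded_write(a, bindings, depth - 1) for a in args)
    return functor + "(" + inner + ")"


# X bound to f(X): the kind of cyclic binding an omitted occur check allows.
print(guarded_write("X", {"X": ("f", "X")}, depth=6))
print(guarded_write(("p", "Y", "a"), {"Y": ("f", "a")}))
```

The depth limit trades completeness of the printed term for guaranteed termination; acyclic terms shallower than the limit are printed in full.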

3 Conclusions 

The quest for efficient unification algorithms is fundamental to increasing the efficiency of logic
programs. From the first unification algorithm of Robinson [5] to the efficient algorithm
of Martelli and Montanari [4], this quest has never stopped.

The unification algorithm presented in this paper is yet another attempt to increase the
efficiency of unification. The algorithm also benefits from its simplicity and clarity, which make it
very easy to understand and implement.

An implementation in C is available as well as a Java applet. Further developments of the 
algorithm are: a recursive version of the algorithm and a version with "occur check". 